How to use Mastodon's RSS feed as a source for a (static) blog

Sun Jan 19 2020 00:00:00 GMT+0000 (Coordinated Universal Time)

If you publish to Mastodon, you can republish your posts elsewhere, for example on a static blog, by parsing your account's RSS feed.

The naïve and quick way of doing this is to just parse the RSS and create one file (or two if there is an attachment) per item and bam, done.
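As a rough illustration of that naïve approach, here is a minimal Python sketch using only the standard library. It assumes the feed is plain RSS 2.0 where every item has a guid, a pubDate and a description. The function names and the guid-based file naming are made up for this example, and attachments are ignored.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime
from pathlib import Path


def parse_items(rss_text):
    """Return one dict per <item> element of an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "guid": item.findtext("guid", default=""),
            # Assumption: every item carries an RFC 822 style pubDate.
            "published": parsedate_to_datetime(item.findtext("pubDate")),
            "html": item.findtext("description", default=""),
        })
    return items


def write_items(items, out_dir):
    """Naive export: write one HTML file per feed item."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for entry in items:
        # Hypothetical naming scheme: last path segment of the guid.
        name = entry["guid"].rstrip("/").rsplit("/", 1)[-1] + ".html"
        (out / name).write_text(entry["html"], encoding="utf-8")
        written.append(name)
    return written
```

Calling `write_items(parse_items(feed_text), "posts")` would then fill a `posts` directory with one file per item.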

Edits

However, what if an item has been edited? Mastodon deals with this by deleting the old item and issuing a new one with a new guid.

If your blog is going to stay consistent with this, it has to delete the old item as well. OK, easy-peasy: just delete files that are not in the RSS feed. But wait, older files may no longer be covered by the RSS feed. Feeds often limit their contents either to a maximum number of items or to newer items only.

So the code has to detect whether an item on disk is missing from the feed, but also check whether the file is new enough that it ought to be covered by the feed. That can be done by comparing the file's timestamp with that of the oldest item in the feed: if the file is older, the feed simply no longer reaches that far back, and we should probably keep the file.
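The rule above could be sketched like this in Python. The function name and the two dictionaries (item id mapped to publication time, one for the files on disk and one for the feed) are assumptions made for the example, not part of any existing tool.

```python
def stale_files(disk, feed):
    """Return ids of files that should be deleted.

    `disk` and `feed` both map an item id to its publication time.
    A file is stale if it is absent from the feed even though it is
    recent enough that the feed would still list it.
    """
    if not feed:
        # An empty feed tells us nothing, so delete nothing.
        return []
    oldest_in_feed = min(feed.values())
    return sorted(
        item_id
        for item_id, published in disk.items()
        if item_id not in feed and published >= oldest_in_feed
    )
```

Files older than the oldest feed item are never returned, so anything that has merely scrolled out of the feed stays on disk.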

Timestamps

But how do you know what timestamp a file has? It must have been stored along with the file: in the front matter, in a separate index file, or encoded in the file's filename.

If you put it into a separate index file, you need that file to stay consistent with the files it describes, i.e. both have to be updated atomically. Too much work!

If you put it into the front matter of a file, then all files must be opened and read. So the easiest thing seems to be to put it in the file name. Then as the feed is loaded, you can check for the oldest item in the feed, and disregard all files whose filenames indicate an older creation date than that.
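One possible way to encode the timestamp in the filename, sketched in Python. The prefix format, the `.md` extension and all function names are assumptions for illustration. The sketch expects timezone-aware datetimes and always stores UTC.

```python
from datetime import datetime, timezone

# Hypothetical filename prefix, e.g. 20200119T100000Z (always UTC).
STAMP = "%Y%m%dT%H%M%SZ"


def filename_for(published, item_id, ext=".md"):
    """Build a filename that starts with the item's publication time."""
    utc = published.astimezone(timezone.utc)
    return f"{utc.strftime(STAMP)}-{item_id}{ext}"


def published_from_filename(name):
    """Recover the publication time from a name built by filename_for."""
    stamp = name.split("-", 1)[0]
    return datetime.strptime(stamp, STAMP).replace(tzinfo=timezone.utc)


def within_feed_window(names, oldest_in_feed):
    """Keep only the files that are not older than the oldest feed item."""
    return [n for n in names if published_from_filename(n) >= oldest_in_feed]
```

Because the prefix puts the year first and has a fixed width, the filenames also sort chronologically in a plain directory listing, and no file has to be opened to find out its age.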